639 research outputs found

    On The Stability of Interpretable Models

    Full text link
    Interpretable classification models are built with the purpose of providing a comprehensible description of the decision logic to an external oversight agent. When considered in isolation, a decision tree, a set of classification rules, or a linear model, are widely recognized as human-interpretable. However, such models are generated as part of a larger analytical process. Bias in data collection and preparation, or in model's construction may severely affect the accountability of the design process. We conduct an experimental study of the stability of interpretable models with respect to feature selection, instance selection, and model selection. Our conclusions should raise awareness and attention of the scientific community on the need of a stability impact assessment of interpretable models

    Classes of Terminating Logic Programs

    Full text link
    Termination of logic programs depends critically on the selection rule, i.e. the rule that determines which atom is selected in each resolution step. In this article, we classify programs (and queries) according to the selection rules for which they terminate. This is a survey and unified view on different approaches in the literature. For each class, we present a sufficient, for most classes even necessary, criterion for determining that a program is in that class. We study six classes: a program strongly terminates if it terminates for all selection rules; a program input terminates if it terminates for selection rules which only select atoms that are sufficiently instantiated in their input positions, so that these arguments do not get instantiated any further by the unification; a program local delay terminates if it terminates for local selection rules which only select atoms that are bounded w.r.t. an appropriate level mapping; a program left-terminates if it terminates for the usual left-to-right selection rule; a program exists-terminates if there exists a selection rule for which it terminates; finally, a program has bounded nondeterminism if it only has finitely many refutations. We propose a semantics-preserving transformation from programs with bounded nondeterminism into strongly terminating programs. Moreover, by unifying different formalisms and making appropriate assumptions, we are able to establish a formal hierarchy between the different classes.Comment: 50 pages. The following mistake was corrected: In figure 5, the first clause for insert was insert([],X,[X]

    Variable Ranges in Linear Constraints

    Get PDF
    We introduce an extension of linear constraints, called linearrange constraints, which allows for (meta-)reasoning about the approximation width of variables. Semantics for linearrange constraints is provided in terms of parameterized linear systems. We devise procedures for checking satisfiability and for entailing the maximal width of a variable. An extension of the constraint logic programming language CLP(R) is proposed by admitting linear-range constraints

    A KDD process for discrimination discovery

    Get PDF
    The acceptance of analytical methods for discrimination discovery by practitioners and legal scholars can be only achieved if the data mining and machine learning communities will be able to provide case studies, methodological refinements, and the consolidation of a KDD process. We summarize here an approach along these directions

    A Relational Approach to Networks in a Tourism Destination: Business and Family Networks in San Vito Lo Capo, Sicily

    Get PDF
    This article constructs a relational framework using the principles of the Network Approach to examining the business exchange structure of a tourist destination. Network Analysis is the methodology to analyse the metrics of collaboration and cooperation among destination companies. The model was applied in a remote tourist destination named San Vito Lo Capo on the island of Sicily, where tourism has significantly expanded in the last twenty years. The focus is on how groupings of small companies within family relations can govern and be responsible for tourism destination cooperation. As the main result, the existence was identified, of a relational framework where three clusters of families with a high density of exchanges emerge. These families can influence the tourism business at the destination, guaranteeing cooperation among other business companies. The findings show the existence and the importance of informal business networks and the contribution of Network Analysis to understanding the structure and cohesiveness of a tourist destination

    The Initial Screening Order Problem

    Full text link
    In this paper we present the initial screening order problem, a crucial step within candidate screening. It involves a human-like screener with an objective to find the first k suitable candidates rather than the best k suitable candidates in a candidate pool given an initial screening order. The initial screening order represents the way in which the human-like screener arranges the candidate pool prior to screening. The choice of initial screening order has considerable effects on the selected set of k candidates. We prove that under an unbalanced candidate pool (e.g., having more male than female candidates), the human-like screener can suffer from uneven efforts that hinder its decision-making over the protected, under-represented group relative to the non-protected, over-represented group. Other fairness results are proven under the human-like screener. This research is based on a collaboration with a large company to better understand its hiring process for potential automation. Our main contribution is the formalization of the initial screening order problem which, we argue, opens the path for future extensions of the current works on ranking algorithms, fairness, and automation for screening procedures

    Bounded Nondeterminism of Logic Programs

    Full text link

    Data Mining for Discrimination Discovery

    Get PDF
    In the context of civil rights law, discrimination refers to unfair or unequal treatment of people based on membership to a category or a minority, without regard to individual merit. Discrimination in credit, mortgage, insurance, labor market, and education has been investigated by researchers in economics and human sciences. With the advent of automatic decision support systems, such as credit scoring systems, the ease of data collection opens several challenges to data analysts for the fight against discrimination. In this paper, we introduce the problem of discovering discrimination through data mining in a dataset of historical decision records, taken by humans or by automatic systems. We formalize the processes of direct and indirect discrimination discovery by modelling protected-by-law groups and contexts where discrimination occurs in a classification rule based syntax. Basically, classification rules extracted from the dataset allow for unveiling contexts of unlawful discrimination, where the degree of burden over protected-bylaw groups is formalized by an extension of the lift measure of a classification rule. In direct discrimination, the extracted rules can be directly mined in search of discriminatory contexts. In indirect discrimination, the mining process needs some background knowledge as a further input, e.g., census data, that combined with the extracted rules might allow for unveiling contexts of discriminatory decisions. A strategy adopted for combining extracted classification rules with background knowledge is called an inference model. In this paper, we propose two inference models and provide automatic procedures for their implementation. An empirical assessment of our results is provided on the German credit dataset and on the PKDD Discovery Challenge 1999 financial dataset
    • …
    corecore